Multiple Structural RNA Alignment with Lagrangian Relaxation

Authors

  • Markus Bauer
  • Gunnar W. Klau
  • Knut Reinert
Abstract

In contrast to proteins, many classes of functionally related RNA molecules show rather weak sequence conservation but a fairly well conserved secondary structure. Hence any method that relates RNA sequences in the form of (multiple) alignments should take structural features into account. Since multiple alignments are of great importance for subsequent data analysis, research on improving the speed and accuracy of such alignments benefits many other analysis problems. We present a formulation for computing provably optimal, structure-based, multiple RNA alignments and give an algorithm that finds such an optimal solution or at least a very good approximation of it. Our formulation is based on the structural trace formulation of Reinert et al. and uses a recently proposed weighting function of Hofacker et al. that makes use of McCaskill’s approach to compute base pair probability functions. To solve the resulting computational problem we propose an algorithm based on Lagrangian relaxation, which has already proved useful in the two-sequence case. We compare our implementation, mLARA, to two recent programs (MARNA and pmmulti) and demonstrate that we can compute multiple alignments with consensus structures that have a significantly lower minimum free energy term than those computed by the other programs. Our prototypical experiments indicate that our new algorithm is the first approach to successfully compute provably optimal multiple structural alignments in reasonable computation time. Further advantages are its applicability to long sequences, where standard dynamic programming approaches must fail, and its ability to deal with pseudoknot structures.
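The general idea of Lagrangian relaxation mentioned in the abstract can be illustrated on a toy problem. The sketch below is not the mLARA algorithm and does not use the paper's alignment formulation; it assumes a small 0/1 maximization problem with a single complicating constraint, and the function name `lagrangian_bound`, the step-size rule, and the example data are all hypothetical. It shows how moving a hard constraint into the objective yields an easy subproblem whose value is an upper bound, and how a subgradient method tightens that bound.

```python
def lagrangian_bound(c, a, b, iterations=100):
    """Bound max c.x subject to a.x <= b, x binary, by dualizing a.x <= b.

    Toy illustration only; assumes a_i >= 0 and b >= 0 so that x = 0 is feasible.
    Returns (best upper bound found, best feasible objective value found).
    """
    lam = 0.0                      # Lagrange multiplier, kept non-negative
    best_upper = float("inf")
    best_feasible = 0.0            # value of the trivially feasible x = 0
    for k in range(1, iterations + 1):
        # With the constraint in the objective, each variable is decided alone.
        x = [1 if ci - lam * ai > 0 else 0 for ci, ai in zip(c, a)]
        used = sum(ai * xi for ai, xi in zip(a, x))
        value = sum(ci * xi for ci, xi in zip(c, x))
        # Value of the relaxed problem: an upper bound for every lam >= 0.
        upper = value + lam * (b - used)
        best_upper = min(best_upper, upper)
        # A relaxed solution that happens to be feasible gives a lower bound.
        if used <= b:
            best_feasible = max(best_feasible, value)
        # Subgradient step with a diminishing step size (an assumed choice).
        lam = max(0.0, lam - (1.0 / k) * (b - used))
    return best_upper, best_feasible


if __name__ == "__main__":
    upper, lower = lagrangian_bound(c=[10, 7, 4], a=[5, 4, 3], b=8)
    print(f"lower bound {lower}, upper bound {upper:.3f}")
```

When the two bounds meet, the feasible solution is provably optimal; otherwise the gap between them measures how far the solution can be from the optimum, which is the sense in which such methods deliver an optimal solution or a good approximation with a guarantee.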

Related resources

Multiple Structural RNA Alignment with Affine Gap Costs Based on Lagrangian Relaxation

In this thesis the structural alignment of RNA sequences is addressed, a topic of crucial significance in the field of computational biology. Contrary to alignments of DNA, alignments of RNA are not only aligned based on sequence information, but largely depend on the correct structural alignment. Since the functions of RNA depend mostly on its secondary structure and this is highly conserved t...

RNA Structural Alignment by Means of Lagrangian Relaxation

This thesis deals with an important topic in computational molecular biology: structurally correct alignments of RNA sequences. Compared to DNA sequences where sequence information is normally sufficient for adequate alignments, the structural aspects of RNA have to be taken into account when dealing with RNA sequences: structure of RNA sequences tends to remain conserved throughout evolution. ...

Fast and Accurate Structural RNA Alignment by Progressive Lagrangian Optimization

During the last few years new functionalities of RNA have been discovered, renewing the need for computational tools for their analysis. To this respect, multiple sequence alignment is an essential step in finding structurally conserved regions in related RNA sequences. In contrast to proteins, many classes of functionally related RNA molecules show a rather weak sequence conservation but inste...

Structural Alignment of Two RNA Sequences with Lagrangian Relaxation

RNA is generally a single-stranded molecule where the bases form hydrogen bonds within the same molecule leading to structure formation. In comparing different homologous RNA molecules it is usually not sufficient to consider only the primary sequence, but it is important to consider both the sequence and the structure of the molecules. Traditional alignment algorithms can only account for the ...

A Lagrangian Relaxation Approach for the Multiple Sequence Alignment Problem

We present a branch-and-bound (bb) algorithm for the multiple sequence alignment problem (MSA), one of the most important problems in computational biology. The upper bound at each bb node is based on a Lagrangian relaxation of an integer linear programming formulation for MSA. Dualizing certain inequalities, the Lagrangian subproblem becomes a pairwise alignment problem, which can be solved ef...

Publication date: 2005